BIDENS: Iterative Density Based Biclustering Algorithm With Application to Gene Expression Analysis

Authors

  • Mohamed A. Mahfouz
  • M. A. Ismail
Abstract

Biclustering is a useful data mining technique for gene expression analysis: it identifies patterns in which different genes are correlated over a subset of conditions. Association rule mining is an efficient approach to biclustering, as in the BIMODULE algorithm, but it is sensitive to the values of its input parameters and to the discretization procedure used in the preprocessing step. Moreover, when noise is present, classical association rule miners discover multiple small fragments of the true bicluster but miss the true bicluster itself. This paper formally presents a generalized noise-tolerant bicluster model, termed μBicluster. Based on this model, an iterative algorithm termed BIDENS is introduced that can discover a set of k possibly overlapping biclusters simultaneously. The model uses a more flexible method to partition the dimensions so as to preserve meaningful and significant biclusters. The proposed algorithm can discover biclusters that are hard for BIMODULE to discover. Experiments on yeast and human gene expression data and on several artificial datasets show that the algorithm offers substantial improvements over several previously proposed biclustering algorithms.

Keywords: machine learning, biclustering, bi-dimensional clustering, gene expression analysis, data mining.
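The fragmentation effect mentioned in the abstract can be illustrated on a toy discretized matrix. The sketch below is neither BIDENS nor BIMODULE: `exact_biclusters` is a hypothetical brute-force stand-in for an exact (zero-noise) miner, and `density` with a 0.9 threshold is an assumed, simplified notion of noise tolerance, not the paper's μBicluster definition.

```python
from itertools import combinations

import numpy as np


def exact_biclusters(binary, min_rows=2, min_cols=2):
    """Brute-force maximal all-ones submatrices (hypothetical exact miner)."""
    n_cols = binary.shape[1]
    found = []
    # Visit larger column sets first so that contained results can be skipped.
    for size in range(n_cols, min_cols - 1, -1):
        for cols in combinations(range(n_cols), size):
            # Rows that are 1 in every selected column.
            support = np.flatnonzero(binary[:, list(cols)].all(axis=1))
            rows = tuple(int(r) for r in support)
            if len(rows) < min_rows:
                continue
            if any(set(rows) <= set(r) and set(cols) <= set(c) for r, c in found):
                continue
            found.append((rows, cols))
    return found


def density(binary, rows, cols):
    """Fraction of ones inside the submatrix (assumed tolerance measure)."""
    return float(binary[np.ix_(list(rows), list(cols))].mean())


# Plant a 4 x 3 bicluster in a 6 x 5 binary matrix, then flip one cell to noise.
noisy = np.zeros((6, 5), dtype=int)
true_rows, true_cols = (0, 1, 2, 3), (0, 1, 2)
noisy[np.ix_(true_rows, true_cols)] = 1
noisy[1, 1] = 0

fragments = exact_biclusters(noisy)
for rows, cols in fragments:
    print("exact fragment:", rows, cols)

# The planted bicluster is no longer all ones, so the exact miner misses it,
# but it still passes the assumed density threshold.
print("planted bicluster found exactly:", (true_rows, true_cols) in fragments)
print("planted bicluster density >= 0.9:", density(noisy, true_rows, true_cols) >= 0.9)
```

With a single noisy cell, the exact search returns only pieces of the planted block, while a density check on the full block still accepts it. A density-style criterion is one plausible way to tolerate noise; the actual model is specified in the full paper.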

Similar resources

A novel biclustering approach with iterative optimization to analyze gene expression data

OBJECTIVE: With the dramatic increase in microarray data, biclustering has become a promising tool for gene expression analysis. Biclustering has been shown to be superior to clustering in identifying multifunctional genes and searching for co-expressed genes under a few specific conditions, that is, a subgroup of all conditions. Biclustering based on a genetic algorithm (GA) has shown better...

Biclustering of gene expression data

Biclustering is an important problem that arises in diverse applications, including the analysis of gene expression and drug interaction data. A large number of clustering approaches have been proposed for gene expression data obtained from microarray experiments. However, the results from the application of standard clustering methods to genes are limited. This limitation is imposed by the exi...

Improving performances of suboptimal greedy iterative biclustering heuristics via localization

MOTIVATION: Biclustering gene expression data is the problem of extracting submatrices of genes and conditions exhibiting significant correlation across both the rows and the columns of a data matrix of expression values. Even the simplest versions of the problem are computationally hard. Most of the proposed solutions therefore employ greedy iterative heuristics that locally optimize a suitably...

The Iterative Signature Algorithm

As we have seen in previous lectures, the technology of DNA chips allows the measurement of mRNA levels simultaneously for thousands of genes. The results of DNA chip experiments are usually organized together in a gene expression matrix, with rows corresponding to genes and columns corresponding to conditions. A bicluster can be defined as a submatrix spanned by a subset of genes and a subset ...

Publication date: 2009